Overview

Dataset statistics

Number of variables: 15
Number of observations: 8659
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 328
Duplicate rows (%): 3.8%
Total size in memory: 896.5 KiB
Average record size in memory: 106.0 B
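
The two derived figures in this block (duplicate-row percentage and average record size) follow from the raw counts. A minimal sketch in Python, assuming 1 KiB = 1024 bytes and one-decimal rounding; both are assumptions, since the report does not state them:

```python
# Recompute the derived dataset statistics from the raw counts in the overview.
n_rows = 8659          # number of observations
duplicate_rows = 328   # duplicate rows
total_kib = 896.5      # total size in memory; 1 KiB = 1024 bytes is assumed

duplicate_pct = 100 * duplicate_rows / n_rows
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results agree with the reported values once rounded to one decimal place.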

Variable types

Boolean: 2
Categorical: 5
Numeric: 8

Alerts

Dataset has 328 (3.8%) duplicate rows [Duplicates]
Cabin_deck is highly overall correlated with HomePlanet [High correlation]
Consumption_Basic is highly overall correlated with Consumption_High_End and 4 other fields [High correlation]
Consumption_High_End is highly overall correlated with Consumption_Basic and 4 other fields [High correlation]
CryoSleep is highly overall correlated with FoodCourt and 4 other fields [High correlation]
FoodCourt is highly overall correlated with Consumption_Basic and 3 other fields [High correlation]
HomePlanet is highly overall correlated with Cabin_deck [High correlation]
RoomService is highly overall correlated with Consumption_High_End and 1 other field [High correlation]
ShoppingMall is highly overall correlated with Consumption_Basic and 1 other field [High correlation]
Spa is highly overall correlated with Consumption_Basic and 2 other fields [High correlation]
VRDeck is highly overall correlated with Consumption_Basic and 3 other fields [High correlation]
VIP is highly imbalanced (84.5%) [Imbalance]
RoomService has 5645 (65.2%) zeros [Zeros]
FoodCourt has 5541 (64.0%) zeros [Zeros]
ShoppingMall has 5684 (65.6%) zeros [Zeros]
Spa has 5398 (62.3%) zeros [Zeros]
VRDeck has 5583 (64.5%) zeros [Zeros]
Consumption_High_End has 3805 (43.9%) zeros [Zeros]
Consumption_Basic has 4164 (48.1%) zeros [Zeros]
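
The percentages in the zero alerts are each column's zero count divided by the 8659 observations. A short sketch that recomputes them, assuming one-decimal rounding (the variable names are illustrative only):

```python
# Zero counts copied from the alerts; 8659 is the number of observations.
n_rows = 8659
zero_counts = {
    "RoomService": 5645,
    "FoodCourt": 5541,
    "ShoppingMall": 5684,
    "Spa": 5398,
    "VRDeck": 5583,
    "Consumption_High_End": 3805,
    "Consumption_Basic": 4164,
}

# Share of zeros per column, in percent, rounded to one decimal place.
zero_share = {name: round(100 * count / n_rows, 1) for name, count in zero_counts.items()}

for name, pct in zero_share.items():
    print(f"{name}: {pct}% zeros")
```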

Reproduction

Analysis started: 2024-05-07 12:02:54.792678
Analysis finished: 2024-05-07 12:03:13.903874
Duration: 19.11 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json

Variables

CryoSleep
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Value | Count | Frequency (%)
False | 5537 | 63.9%
True | 3122 | 36.1%

Destination
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 67.8 KiB
TRAPPIST-1e: 6071
55 Cancri e: 1793
PSO J318.5-22: 795

Length

Max length: 13
Median length: 11
Mean length: 11.183624
Min length: 11

Characters and Unicode

Total characters: 96839
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
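
The length and character totals for Destination can be recomputed from the three category labels and their counts (6071, 1793, 795). A minimal sketch, assuming each label is stored exactly as printed in the report:

```python
# Category labels and counts copied from the Destination summary.
counts = {"TRAPPIST-1e": 6071, "55 Cancri e": 1793, "PSO J318.5-22": 795}

n_rows = sum(counts.values())
# Total characters = label length times the number of rows carrying that label.
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / n_rows
# Distinct characters across all labels (case-sensitive, spaces included).
distinct_chars = len(set("".join(counts)))

print(n_rows, total_chars, round(mean_length, 6), distinct_chars)
```

The results line up with the reported total characters (96839), mean length (11.183624) and distinct characters (23).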

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TRAPPIST-1e
2nd row: TRAPPIST-1e
3rd row: TRAPPIST-1e
4th row: TRAPPIST-1e
5th row: TRAPPIST-1e

Common Values

Value | Count | Frequency (%)
TRAPPIST-1e | 6071 | 70.1%
55 Cancri e | 1793 | 20.7%
PSO J318.5-22 | 795 | 9.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
trappist-1e | 6071 | 46.6%
55 | 1793 | 13.8%
cancri | 1793 | 13.8%
e | 1793 | 13.8%
pso | 795 | 6.1%
j318.5-22 | 795 | 6.1%

Most occurring characters

Value | Count | Frequency (%)
P | 12937 | 13.4%
T | 12142 | 12.5%
e | 7864 | 8.1%
S | 6866 | 7.1%
- | 6866 | 7.1%
1 | 6866 | 7.1%
A | 6071 | 6.3%
I | 6071 | 6.3%
R | 6071 | 6.3%
5 | 4381 | 4.5%
Other values (13) | 20704 | 21.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 96839 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 96839 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 96839 | 100.0%

Most frequent character per category, script, and block

All characters fall under the single (unknown) category, script, and block, so each breakdown is identical to the Most occurring characters table.

VIP
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Value | Count | Frequency (%)
False | 8464 | 97.7%
True | 195 | 2.3%
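
The 84.5% imbalance reported for VIP is not the majority share (97.7%). It is consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the number:

```python
import math

# Class counts copied from the VIP frequency table (False, True).
counts = [8464, 195]
total = sum(counts)
probs = [c / total for c in counts]

# Shannon entropy in bits of the observed class distribution.
entropy_bits = -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed score: 0 for perfectly balanced classes, 1 for a single class.
imbalance = 1 - entropy_bits / math.log2(len(counts))

print(f"Imbalance score: {imbalance:.1%}")
```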

RoomService
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1361
Distinct (%): 15.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 221.63415
Minimum: 0
Maximum: 14327
Zeros: 5645
Zeros (%): 65.2%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 55
95-th percentile: 1256.2
Maximum: 14327
Range: 14327
Interquartile range (IQR): 55

Descriptive statistics

Standard deviation: 648.26018
Coefficient of variation (CV): 2.924911
Kurtosis: 67.75414
Mean: 221.63415
Median Absolute Deviation (MAD): 0
Skewness: 6.3564064
Sum: 1919130.1
Variance: 420241.26
Monotonicity: Not monotonic
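
Several of the RoomService statistics are simple functions of the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. A brief sketch using the reported values (the inputs are rounded as printed, so the results match only up to that rounding):

```python
# Values copied from the RoomService statistics.
std = 648.26018
mean = 221.63415
q1, q3 = 0.0, 55.0
minimum, maximum = 0.0, 14327.0

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance is the squared standard deviation
iqr = q3 - q1              # interquartile range
value_range = maximum - minimum

print(f"CV={cv:.6f}, variance={variance:.2f}, IQR={iqr}, range={value_range}")
```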
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 5645 | 65.2%
1 | 117 | 1.4%
2 | 78 | 0.9%
3 | 61 | 0.7%
4 | 47 | 0.5%
5 | 28 | 0.3%
9 | 25 | 0.3%
8 | 24 | 0.3%
6 | 24 | 0.3%
14 | 21 | 0.2%
Other values (1351) | 2589 | 29.9%

Minimum values

Value | Count | Frequency (%)
0 | 5645 | 65.2%
1 | 117 | 1.4%
1.077766015 | 1 | < 0.1%
2 | 78 | 0.9%
3 | 61 | 0.7%
4 | 47 | 0.5%
5 | 28 | 0.3%
6 | 24 | 0.3%
7 | 17 | 0.2%
8 | 24 | 0.3%

Maximum values

Value | Count | Frequency (%)
14327 | 1 | < 0.1%
9920 | 1 | < 0.1%
8586 | 1 | < 0.1%
8243 | 1 | < 0.1%
8209 | 1 | < 0.1%
8168 | 1 | < 0.1%
8142 | 1 | < 0.1%
8030 | 1 | < 0.1%
7406 | 1 | < 0.1%
7172 | 1 | < 0.1%

FoodCourt
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1577
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 427.79642
Minimum: 0
Maximum: 29813
Zeros: 5541
Zeros (%): 64.0%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 81.5
95-th percentile: 2561.6
Maximum: 29813
Range: 29813
Interquartile range (IQR): 81.5

Descriptive statistics

Standard deviation: 1502.8659
Coefficient of variation (CV): 3.5130398
Kurtosis: 85.968892
Mean: 427.79642
Median Absolute Deviation (MAD): 0
Skewness: 7.5336215
Sum: 3704289.2
Variance: 2258605.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value | Count | Frequency (%)
0 | 5541 | 64.0%
1 | 116 | 1.3%
2 | 75 | 0.9%
4 | 53 | 0.6%
3 | 53 | 0.6%
5 | 33 | 0.4%
6 | 31 | 0.4%
9 | 28 | 0.3%
7 | 27 | 0.3%
10 | 27 | 0.3%
Other values (1567) | 2675 | 30.9%

Minimum values

Value | Count | Frequency (%)
0 | 5541 | 64.0%
1 | 116 | 1.3%
2 | 75 | 0.9%
3 | 53 | 0.6%
4 | 53 | 0.6%
5 | 33 | 0.4%
6 | 31 | 0.4%
7 | 27 | 0.3%
8 | 20 | 0.2%
9 | 28 | 0.3%

Maximum values

Value | Count | Frequency (%)
29813 | 1 | < 0.1%
27723 | 1 | < 0.1%
27071 | 1 | < 0.1%
26830 | 1 | < 0.1%
18481 | 1 | < 0.1%
17958 | 1 | < 0.1%
17901 | 1 | < 0.1%
17687 | 1 | < 0.1%
17432 | 1 | < 0.1%
17394 | 1 | < 0.1%

ShoppingMall
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1201
Distinct (%): 13.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 169.56456
Minimum: 0
Maximum: 23492
Zeros: 5684
Zeros (%): 65.6%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 30
95-th percentile: 921
Maximum: 23492
Range: 23492
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 573.09382
Coefficient of variation (CV): 3.3797971
Kurtosis: 373.75265
Mean: 169.56456
Median Absolute Deviation (MAD): 0
Skewness: 13.003727
Sum: 1468259.6
Variance: 328436.53
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value | Count | Frequency (%)
0 | 5684 | 65.6%
1 | 152 | 1.8%
2 | 80 | 0.9%
3 | 59 | 0.7%
4 | 45 | 0.5%
5 | 38 | 0.4%
7 | 35 | 0.4%
6 | 34 | 0.4%
13 | 29 | 0.3%
9 | 28 | 0.3%
Other values (1191) | 2475 | 28.6%

Minimum values

Value | Count | Frequency (%)
0 | 5684 | 65.6%
1 | 152 | 1.8%
2 | 80 | 0.9%
3 | 59 | 0.7%
4 | 45 | 0.5%
5 | 38 | 0.4%
6 | 34 | 0.4%
7 | 35 | 0.4%
8 | 28 | 0.3%
9 | 28 | 0.3%

Maximum values

Value | Count | Frequency (%)
23492 | 1 | < 0.1%
12253 | 1 | < 0.1%
9058 | 1 | < 0.1%
7810 | 1 | < 0.1%
7185 | 1 | < 0.1%
7148 | 1 | < 0.1%
7104 | 1 | < 0.1%
6805 | 1 | < 0.1%
6331 | 1 | < 0.1%
6221 | 1 | < 0.1%

Spa
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1404
Distinct (%): 16.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 284.87394
Minimum: 0
Maximum: 16139
Zeros: 5398
Zeros (%): 62.3%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 64
95-th percentile: 1520.5
Maximum: 16139
Range: 16139
Interquartile range (IQR): 64

Descriptive statistics

Standard deviation: 983.61954
Coefficient of variation (CV): 3.4528239
Kurtosis: 66.755155
Mean: 284.87394
Median Absolute Deviation (MAD): 0
Skewness: 6.943894
Sum: 2466723.4
Variance: 967507.41
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value | Count | Frequency (%)
0 | 5398 | 62.3%
1 | 146 | 1.7%
2 | 105 | 1.2%
5 | 53 | 0.6%
3 | 53 | 0.6%
4 | 46 | 0.5%
7 | 34 | 0.4%
6 | 33 | 0.4%
9 | 28 | 0.3%
8 | 28 | 0.3%
Other values (1394) | 2735 | 31.6%

Minimum values

Value | Count | Frequency (%)
0 | 5398 | 62.3%
1 | 146 | 1.7%
2 | 105 | 1.2%
3 | 53 | 0.6%
4 | 46 | 0.5%
4.560684576 | 1 | < 0.1%
5 | 53 | 0.6%
5.642490463 | 1 | < 0.1%
6 | 33 | 0.4%
7 | 34 | 0.4%

Maximum values

Value | Count | Frequency (%)
16139 | 1 | < 0.1%
15586 | 1 | < 0.1%
15331 | 1 | < 0.1%
15238 | 1 | < 0.1%
13995 | 1 | < 0.1%
13104 | 1 | < 0.1%
12062 | 1 | < 0.1%
11001 | 1 | < 0.1%
10976 | 1 | < 0.1%
10941 | 1 | < 0.1%

VRDeck
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1381
Distinct (%): 15.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 285.62515
Minimum: 0
Maximum: 24133
Zeros: 5583
Zeros (%): 64.5%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 49.5
95-th percentile: 1457.2
Maximum: 24133
Range: 24133
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 1056.8989
Coefficient of variation (CV): 3.7003007
Kurtosis: 97.349881
Mean: 285.62515
Median Absolute Deviation (MAD): 0
Skewness: 8.0971526
Sum: 2473228.2
Variance: 1117035.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value | Count | Frequency (%)
0 | 5583 | 64.5%
1 | 138 | 1.6%
2 | 70 | 0.8%
3 | 56 | 0.6%
5 | 51 | 0.6%
4 | 47 | 0.5%
6 | 32 | 0.4%
8 | 30 | 0.3%
7 | 29 | 0.3%
9 | 25 | 0.3%
Other values (1371) | 2598 | 30.0%

Minimum values

Value | Count | Frequency (%)
0 | 5583 | 64.5%
1 | 138 | 1.6%
2 | 70 | 0.8%
3 | 56 | 0.6%
4 | 47 | 0.5%
5 | 51 | 0.6%
6 | 32 | 0.4%
7 | 29 | 0.3%
8 | 30 | 0.3%
9 | 25 | 0.3%

Maximum values

Value | Count | Frequency (%)
24133 | 1 | < 0.1%
20336 | 1 | < 0.1%
17074 | 1 | < 0.1%
16337 | 1 | < 0.1%
12708 | 1 | < 0.1%
12682 | 1 | < 0.1%
12424 | 1 | < 0.1%
12392 | 1 | < 0.1%
12323 | 1 | < 0.1%
12143 | 1 | < 0.1%

Cabin_deck
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 67.8 KiB
F: 2868
G: 2615
E: 875
B: 804
C: 752
Other values (3): 745

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8659
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: F
3rd row: A
4th row: A
5th row: F

Common Values

Value | Count | Frequency (%)
F | 2868 | 33.1%
G | 2615 | 30.2%
E | 875 | 10.1%
B | 804 | 9.3%
C | 752 | 8.7%
D | 483 | 5.6%
A | 257 | 3.0%
T | 5 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
f | 2868 | 33.1%
g | 2615 | 30.2%
e | 875 | 10.1%
b | 804 | 9.3%
c | 752 | 8.7%
d | 483 | 5.6%
a | 257 | 3.0%
t | 5 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
F | 2868 | 33.1%
G | 2615 | 30.2%
E | 875 | 10.1%
B | 804 | 9.3%
C | 752 | 8.7%
D | 483 | 5.6%
A | 257 | 3.0%
T | 5 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 8659 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 8659 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 8659 | 100.0%

Most frequent character per category, script, and block

All characters fall under the single (unknown) category, script, and block, so each breakdown is identical to the Most occurring characters table.

Group_size
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0349925
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 6
Maximum: 8
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5964559
Coefficient of variation (CV): 0.78450211
Kurtosis: 3.1734359
Mean: 2.0349925
Median Absolute Deviation (MAD): 0
Skewness: 1.8903565
Sum: 17621
Variance: 2.5486714
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common Values

Value | Count | Frequency (%)
1 | 4790 | 55.3%
2 | 1671 | 19.3%
3 | 1018 | 11.8%
4 | 410 | 4.7%
5 | 263 | 3.0%
7 | 230 | 2.7%
6 | 173 | 2.0%
8 | 104 | 1.2%

Minimum values

Value | Count | Frequency (%)
1 | 4790 | 55.3%
2 | 1671 | 19.3%
3 | 1018 | 11.8%
4 | 410 | 4.7%
5 | 263 | 3.0%
6 | 173 | 2.0%
7 | 230 | 2.7%
8 | 104 | 1.2%

Maximum values

Value | Count | Frequency (%)
8 | 104 | 1.2%
7 | 230 | 2.7%
6 | 173 | 2.0%
5 | 263 | 3.0%
4 | 410 | 4.7%
3 | 1018 | 11.8%
2 | 1671 | 19.3%
1 | 4790 | 55.3%
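
Because Group_size takes only eight distinct values, its sum, mean, variance and standard deviation can be recomputed directly from the value counts. The sketch below divides by n - 1 (sample variance); that choice is an assumption, although it reproduces the reported 2.5486714:

```python
import math

# Value counts copied from the Group_size frequency table.
value_counts = {1: 4790, 2: 1671, 3: 1018, 4: 410, 5: 263, 6: 173, 7: 230, 8: 104}

n = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n

# Sample variance (assumed n - 1 denominator) from the weighted squared deviations.
variance = sum(count * (value - mean) ** 2 for value, count in value_counts.items()) / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 7), round(variance, 7), round(std, 7))
```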

HomePlanet
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 67.8 KiB
Earth: 4709
Europa: 2141
Mars: 1809

Length

Max length: 6
Median length: 5
Mean length: 5.0383416
Min length: 4

Characters and Unicode

Total characters: 43627
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Europa
2nd row: Earth
3rd row: Europa
4th row: Europa
5th row: Earth

Common Values

Value | Count | Frequency (%)
Earth | 4709 | 54.4%
Europa | 2141 | 24.7%
Mars | 1809 | 20.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
earth | 4709 | 54.4%
europa | 2141 | 24.7%
mars | 1809 | 20.9%

Most occurring characters

Value | Count | Frequency (%)
a | 8659 | 19.8%
r | 8659 | 19.8%
E | 6850 | 15.7%
t | 4709 | 10.8%
h | 4709 | 10.8%
u | 2141 | 4.9%
o | 2141 | 4.9%
p | 2141 | 4.9%
M | 1809 | 4.1%
s | 1809 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 43627 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 43627 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 43627 | 100.0%

Most frequent character per category, script, and block

All characters fall under the single (unknown) category, script, and block, so each breakdown is identical to the Most occurring characters table.

Transported
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 67.8 KiB
1: 4375
0: 4284

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8659
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 4375 | 50.5%
0 | 4284 | 49.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
1 | 4375 | 50.5%
0 | 4284 | 49.5%

Most occurring characters

Value | Count | Frequency (%)
1 | 4375 | 50.5%
0 | 4284 | 49.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 8659 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 8659 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 8659 | 100.0%

Most frequent character per category, script, and block

All characters fall under the single (unknown) category, script, and block, so each breakdown is identical to the Most occurring characters table.

Consumption_High_End
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2540
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 792.13324
Minimum: 0
Maximum: 25463.229
Zeros: 3805
Zeros (%): 43.9%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 80
Q3: 857
95-th percentile: 3660
Maximum: 25463.229
Range: 25463.229
Interquartile range (IQR): 857

Descriptive statistics

Standard deviation: 1643.9217
Coefficient of variation (CV): 2.0753096
Kurtosis: 29.791174
Mean: 792.13324
Median Absolute Deviation (MAD): 80
Skewness: 4.5050317
Sum: 6859081.7
Variance: 2702478.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value | Count | Frequency (%)
0 | 3805 | 43.9%
1 | 26 | 0.3%
2 | 23 | 0.3%
4 | 22 | 0.3%
3 | 20 | 0.2%
5 | 18 | 0.2%
7 | 16 | 0.2%
804 | 15 | 0.2%
11 | 15 | 0.2%
6 | 13 | 0.2%
Other values (2530) | 4686 | 54.1%

Minimum values

Value | Count | Frequency (%)
0 | 3805 | 43.9%
1 | 26 | 0.3%
1.077766015 | 1 | < 0.1%
2 | 23 | 0.3%
3 | 20 | 0.2%
4 | 22 | 0.3%
5 | 18 | 0.2%
6 | 13 | 0.2%
7 | 16 | 0.2%
7.395469138 | 1 | < 0.1%

Maximum values

Value | Count | Frequency (%)
25463.22895 | 1 | < 0.1%
20961 | 1 | < 0.1%
18037 | 1 | < 0.1%
17928 | 1 | < 0.1%
16826 | 1 | < 0.1%
16762 | 1 | < 0.1%
16394 | 1 | < 0.1%
16059 | 1 | < 0.1%
15758 | 1 | < 0.1%
14695 | 1 | < 0.1%

Consumption_Basic
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2110
Distinct (%): 24.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 597.36099
Minimum: 0
Maximum: 29813
Zeros: 4164
Zeros (%): 48.1%
Negative: 0
Negative (%): 0.0%
Memory size: 67.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3
Q3: 608
95-th percentile: 3070.6
Maximum: 29813
Range: 29813
Interquartile range (IQR): 608

Descriptive statistics

Standard deviation: 1599.5818
Coefficient of variation (CV): 2.6777474
Kurtosis: 71.155004
Mean: 597.36099
Median Absolute Deviation (MAD): 3
Skewness: 6.7157102
Sum: 5172548.8
Variance: 2558662
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value | Count | Frequency (%)
0 | 4164 | 48.1%
1 | 98 | 1.1%
2 | 53 | 0.6%
3 | 41 | 0.5%
5 | 40 | 0.5%
4 | 38 | 0.4%
6 | 32 | 0.4%
10 | 30 | 0.3%
13 | 29 | 0.3%
7 | 28 | 0.3%
Other values (2100) | 4106 | 47.4%

Minimum values

Value | Count | Frequency (%)
0 | 4164 | 48.1%
1 | 98 | 1.1%
2 | 53 | 0.6%
3 | 41 | 0.5%
4 | 38 | 0.4%
5 | 40 | 0.5%
6 | 32 | 0.4%
7 | 28 | 0.3%
8 | 17 | 0.2%
9 | 26 | 0.3%

Maximum values

Value | Count | Frequency (%)
29813 | 1 | < 0.1%
27726 | 1 | < 0.1%
27071 | 1 | < 0.1%
26830 | 1 | < 0.1%
23858 | 1 | < 0.1%
18481 | 1 | < 0.1%
18057 | 1 | < 0.1%
17901 | 1 | < 0.1%
17687 | 1 | < 0.1%
17432 | 1 | < 0.1%

Age_group
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 67.8 KiB
Young adults: 5260
Middle-aged: 1599
Minor: 1550
Senior: 250

Length

Max length: 12
Median length: 12
Mean length: 10.389075
Min length: 5

Characters and Unicode

Total characters: 89959
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Young adults
2nd row: Young adults
3rd row: Middle-aged
4th row: Young adults
5th row: Minor

Common Values

Value | Count | Frequency (%)
Young adults | 5260 | 60.7%
Middle-aged | 1599 | 18.5%
Minor | 1550 | 17.9%
Senior | 250 | 2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
young | 5260 | 37.8%
adults | 5260 | 37.8%
middle-aged | 1599 | 11.5%
minor | 1550 | 11.1%
senior | 250 | 1.8%

Most occurring characters

Value | Count | Frequency (%)
u | 10520 | 11.7%
d | 10057 | 11.2%
n | 7060 | 7.8%
o | 7060 | 7.8%
l | 6859 | 7.6%
g | 6859 | 7.6%
a | 6859 | 7.6%
t | 5260 | 5.8%
s | 5260 | 5.8%
Y | 5260 | 5.8%
Other values (7) | 18905 | 21.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 89959 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 89959 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 89959 | 100.0%

Most frequent character per category, script, and block

All characters fall under the single (unknown) category, script, and block, so each breakdown is identical to the Most occurring characters table.

Interactions

[64 pairwise interaction plots (Matplotlib v3.8.4 SVG figures); images not reproduced.]

Correlations

Variable | Age_group | Cabin_deck | Consumption_Basic | Consumption_High_End | CryoSleep | Destination | FoodCourt | Group_size | HomePlanet | RoomService | ShoppingMall | Spa | Transported | VIP | VRDeck
Age_group | 1.000 | 0.160 | 0.085 | 0.108 | 0.122 | 0.026 | 0.062 | -0.150 | 0.137 | 0.089 | 0.080 | 0.081 | 0.117 | 0.074 | 0.070
Cabin_deck | 0.160 | 1.000 | -0.259 | -0.258 | 0.339 | 0.246 | -0.268 | -0.141 | 0.754 | -0.055 | -0.041 | -0.223 | 0.221 | 0.198 | -0.223
Consumption_Basic | 0.085 | -0.259 | 1.000 | 0.632 | 0.173 | 0.086 | 0.772 | -0.110 | 0.261 | 0.402 | 0.653 | 0.522 | 0.093 | 0.141 | 0.500
Consumption_High_End | 0.108 | -0.258 | 0.632 | 1.000 | 0.218 | 0.088 | 0.527 | -0.122 | 0.252 | 0.611 | 0.410 | 0.704 | 0.257 | 0.117 | 0.672
CryoSleep | 0.122 | 0.339 | 0.173 | 0.218 | 1.000 | 0.122 | -0.545 | 0.101 | 0.124 | -0.532 | -0.527 | -0.563 | 0.466 | 0.078 | -0.540
Destination | 0.026 | 0.246 | 0.086 | 0.088 | 0.122 | 1.000 | -0.017 | -0.035 | 0.262 | 0.102 | 0.092 | 0.024 | 0.113 | 0.045 | -0.008
FoodCourt | 0.062 | -0.268 | 0.772 | 0.527 | -0.545 | -0.017 | 1.000 | -0.059 | 0.253 | 0.194 | 0.199 | 0.482 | 0.080 | 0.135 | 0.507
Group_size | -0.150 | -0.141 | -0.110 | -0.122 | 0.101 | -0.035 | -0.059 | 1.000 | 0.241 | -0.144 | -0.142 | -0.078 | 0.127 | 0.044 | -0.077
HomePlanet | 0.137 | 0.754 | 0.261 | 0.252 | 0.124 | 0.262 | 0.253 | 0.241 | 1.000 | 0.113 | 0.049 | -0.003 | 0.202 | 0.174 | -0.076
RoomService | 0.089 | -0.055 | 0.402 | 0.611 | -0.532 | 0.102 | 0.194 | -0.144 | 0.113 | 1.000 | 0.447 | 0.259 | 0.161 | 0.052 | 0.188
ShoppingMall | 0.080 | -0.041 | 0.653 | 0.410 | -0.527 | 0.092 | 0.199 | -0.142 | 0.049 | 0.447 | 1.000 | 0.267 | 0.034 | 0.007 | 0.208
Spa | 0.081 | -0.223 | 0.522 | 0.704 | -0.563 | 0.024 | 0.482 | -0.078 | -0.003 | 0.259 | 0.267 | 1.000 | 0.186 | 0.078 | 0.443
Transported | 0.117 | 0.221 | 0.093 | 0.257 | 0.466 | 0.113 | 0.080 | 0.127 | 0.202 | 0.161 | 0.034 | 0.186 | 1.000 | 0.033 | -0.352
VIP | 0.074 | 0.198 | 0.141 | 0.117 | 0.078 | 0.045 | 0.135 | 0.044 | 0.174 | 0.052 | 0.007 | 0.078 | 0.033 | 1.000 | 0.095
VRDeck | 0.070 | -0.223 | 0.500 | 0.672 | -0.540 | -0.008 | 0.507 | -0.077 | -0.076 | 0.188 | 0.208 | 0.443 | -0.352 | 0.095 | 1.000
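
The "highly overall correlated" alerts in the overview appear to correspond to off-diagonal entries of this matrix with absolute value of at least 0.5. The cutoff is an assumption inferred from the numbers, not something the report states. A small sketch that applies it to two rows copied from the matrix:

```python
# Column order of the correlation matrix.
columns = [
    "Age_group", "Cabin_deck", "Consumption_Basic", "Consumption_High_End",
    "CryoSleep", "Destination", "FoodCourt", "Group_size", "HomePlanet",
    "RoomService", "ShoppingMall", "Spa", "Transported", "VIP", "VRDeck",
]

# Two rows copied from the matrix (values as printed, three decimals).
rows = {
    "Cabin_deck": [0.160, 1.000, -0.259, -0.258, 0.339, 0.246, -0.268, -0.141,
                   0.754, -0.055, -0.041, -0.223, 0.221, 0.198, -0.223],
    "CryoSleep": [0.122, 0.339, 0.173, 0.218, 1.000, 0.122, -0.545, 0.101,
                  0.124, -0.532, -0.527, -0.563, 0.466, 0.078, -0.540],
}

THRESHOLD = 0.5  # assumed cutoff, inferred from the alerts


def highly_correlated(name):
    """Return the other variables whose |correlation| with `name` meets the cutoff."""
    return [col for col, r in zip(columns, rows[name])
            if col != name and abs(r) >= THRESHOLD]


for name in rows:
    print(name, "->", highly_correlated(name))
```

Under this cutoff, Cabin_deck is flagged only against HomePlanet and CryoSleep against the five spending columns, which is what the alerts list.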

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

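First rows

Index | CryoSleep | Destination | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Cabin_deck | Group_size | HomePlanet | Transported | Consumption_High_End | Consumption_Basic | Age_group
0 | False | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | B | 1 | Europa | 0 | 0.0 | 0.0 | Young adults
1 | False | TRAPPIST-1e | False | 109.0 | 9.0 | 25.0 | 549.0 | 44.0 | F | 1 | Earth | 1 | 702.0 | 34.0 | Young adults
2 | False | TRAPPIST-1e | True | 43.0 | 3576.0 | 0.0 | 6715.0 | 49.0 | A | 2 | Europa | 0 | 6807.0 | 3576.0 | Middle-aged
3 | False | TRAPPIST-1e | False | 0.0 | 1283.0 | 371.0 | 3329.0 | 193.0 | A | 2 | Europa | 0 | 3522.0 | 1654.0 | Young adults
4 | False | TRAPPIST-1e | False | 303.0 | 70.0 | 151.0 | 565.0 | 2.0 | F | 1 | Earth | 1 | 870.0 | 221.0 | Minor
5 | False | PSO J318.5-22 | False | 0.0 | 483.0 | 0.0 | 291.0 | 0.0 | F | 1 | Earth | 1 | 291.0 | 483.0 | Middle-aged
6 | False | TRAPPIST-1e | False | 42.0 | 1539.0 | 3.0 | 0.0 | 0.0 | F | 2 | Earth | 1 | 42.0 | 1542.0 | Young adults
7 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 2 | Earth | 1 | 0.0 | 0.0 | Young adults
8 | False | TRAPPIST-1e | False | 0.0 | 785.0 | 17.0 | 216.0 | 0.0 | F | 1 | Earth | 1 | 216.0 | 802.0 | Young adults
9 | True | 55 Cancri e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | B | 3 | Europa | 1 | 0.0 | 0.0 | Minor

Last rows

Index | CryoSleep | Destination | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Cabin_deck | Group_size | HomePlanet | Transported | Consumption_High_End | Consumption_Basic | Age_group
8649 | False | TRAPPIST-1e | False | 86.0 | 3.0 | 149.0 | 208.0 | 329.0 | F | 2 | Earth | 0 | 623.0 | 152.0 | Young adults
8650 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 1 | 0.0 | 0.0 | Young adults
8651 | False | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | A | 3 | Europa | 1 | 0.0 | 0.0 | Minor
8652 | False | TRAPPIST-1e | False | 1.0 | 1146.0 | 0.0 | 50.0 | 34.0 | A | 3 | Europa | 0 | 85.0 | 1146.0 | Young adults
8653 | False | TRAPPIST-1e | False | 0.0 | 3208.0 | 0.0 | 2.0 | 330.0 | A | 3 | Europa | 1 | 332.0 | 3208.0 | Young adults
8654 | False | 55 Cancri e | True | 0.0 | 6819.0 | 0.0 | 1643.0 | 74.0 | A | 1 | Europa | 0 | 1717.0 | 6819.0 | Middle-aged
8655 | True | PSO J318.5-22 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 0 | 0.0 | 0.0 | Young adults
8656 | False | TRAPPIST-1e | False | 0.0 | 0.0 | 1872.0 | 1.0 | 0.0 | G | 1 | Earth | 1 | 1.0 | 1872.0 | Young adults
8657 | False | 55 Cancri e | False | 0.0 | 1049.0 | 0.0 | 353.0 | 3235.0 | E | 2 | Europa | 0 | 3588.0 | 1049.0 | Young adults
8658 | False | TRAPPIST-1e | False | 126.0 | 4688.0 | 0.0 | 0.0 | 12.0 | E | 2 | Europa | 1 | 138.0 | 4688.0 | Middle-aged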
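
In the sampled rows, Consumption_High_End equals RoomService + Spa + VRDeck, and Consumption_Basic equals FoodCourt + ShoppingMall. The report does not document how these two columns were derived, so this is an observation from the sample rather than a confirmed definition. A quick check on four of the sampled rows:

```python
# Each tuple: (RoomService, FoodCourt, ShoppingMall, Spa, VRDeck,
#              Consumption_High_End, Consumption_Basic), keyed by row index.
sample_rows = {
    1: (109.0, 9.0, 25.0, 549.0, 44.0, 702.0, 34.0),
    2: (43.0, 3576.0, 0.0, 6715.0, 49.0, 6807.0, 3576.0),
    8649: (86.0, 3.0, 149.0, 208.0, 329.0, 623.0, 152.0),
    8657: (0.0, 1049.0, 0.0, 353.0, 3235.0, 3588.0, 1049.0),
}


def matches_assumed_definition(row):
    """Check the assumed sums against the two consumption columns."""
    room, food, mall, spa, vr, high_end, basic = row
    return high_end == room + spa + vr and basic == food + mall


all_match = all(matches_assumed_definition(row) for row in sample_rows.values())
print("Assumed definitions hold on the checked rows:", all_match)
```

If the relationship holds for the full dataset, it would also help explain why the two consumption columns correlate strongly with the individual spending columns.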

Duplicate rows

Most frequently occurring

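Index | CryoSleep | Destination | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Cabin_deck | Group_size | HomePlanet | Transported | Consumption_High_End | Consumption_Basic | Age_group | # duplicates
293 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 1 | 0.0 | 0.0 | Young adults | 175
269 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | F | 1 | Mars | 1 | 0.0 | 0.0 | Young adults | 174
289 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 0 | 0.0 | 0.0 | Young adults | 122
169 | True | PSO J318.5-22 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 1 | 0.0 | 0.0 | Young adults | 109
291 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 1 | 0.0 | 0.0 | Minor | 74
274 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | F | 2 | Mars | 1 | 0.0 | 0.0 | Young adults | 63
205 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | B | 2 | Europa | 1 | 0.0 | 0.0 | Young adults | 59
165 | True | PSO J318.5-22 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 0 | 0.0 | 0.0 | Young adults | 58
277 | True | TRAPPIST-1e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | F | 3 | Mars | 1 | 0.0 | 0.0 | Minor | 53
139 | True | 55 Cancri e | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | G | 1 | Earth | 1 | 0.0 | 0.0 | Young adults | 52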
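
A table of this kind can be built by counting identical rows and keeping those that occur more than once. The sketch below uses a handful of hypothetical, shortened rows (not taken from the dataset) to show the idea; it is not the profiler's own implementation:

```python
from collections import Counter

# Hypothetical, shortened rows: (CryoSleep, Destination, Cabin_deck, HomePlanet).
rows = [
    (True, "TRAPPIST-1e", "G", "Earth"),
    (True, "TRAPPIST-1e", "G", "Earth"),
    (True, "TRAPPIST-1e", "F", "Mars"),
    (False, "55 Cancri e", "A", "Europa"),
    (True, "TRAPPIST-1e", "F", "Mars"),
    (True, "TRAPPIST-1e", "G", "Earth"),
]

# Count identical rows, then keep only those occurring more than once,
# ordered from most to least frequent.
counts = Counter(rows)
duplicated = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicated:
    print(row, "occurs", n, "times")
```

Most of the frequent duplicates in the report are CryoSleep passengers with zero spending in every category, which leaves few columns to tell rows apart.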